Parallelization Techniques for Sparse Matrix Applications

Authors

  • Manuel Ujaldon
  • Emilio L. Zapata
  • Shamik D. Sharma
  • Joel H. Saltz
Abstract

Sparse matrix problems are difficult to parallelize efficiently on distributed memory machines since data is often accessed indirectly. Inspector/executor strategies, which are typically used to parallelize loops with indirect references, incur substantial run-time preprocessing overheads when references with multiple levels of indirection are encountered, a frequent occurrence in sparse matrix algorithms. The sparse array rolling (SAR) technique, introduced in [15], significantly reduces these preprocessing overheads. This paper outlines the SAR approach and describes its runtime support, accompanied by a detailed performance evaluation. The results demonstrate that SAR yields a significant reduction in preprocessing overheads compared to standard inspector/executor techniques.

Related articles

Run-time Parallelization Techniques for Sparse Applications

Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops arise frequently in practice, we have introduced a novel framework for their identification: speculative parallelization. While we have previously shown that this method is inherently scalable, its p...

Web-Site-Based Partitioning Techniques for Efficient Parallelization of the PageRank Computation

The efficiency of the PageRank computation is important since the constantly evolving nature of the Web requires this computation to be repeated many times. PageRank computation includes repeated iterative sparse matrix-vector multiplications. Due to the enormous size of the Web matrix to be multiplied, PageRank computations are usually carried out on parallel systems. Graph and hypergraph par...

Domain Decomposition Based High Performance Parallel Computing

The study deals with the parallelization of finite element based Navier-Stokes codes using domain decomposition and state-of-the-art sparse direct solvers. There has been significant improvement in the performance of sparse direct solvers. Parallel sparse direct solvers are not found to exhibit good scalability. Hence, the parallelization of sparse direct solvers is done using domain decomposition t...

Martin Köhler, Jens Saak: Efficiency Improving Implementation Techniques for Large Scale Matrix Equation Solvers (Chemnitz Scientific Computing Preprints, CSC/09-10)

We address the important field of large scale matrix based algorithms in control and model order reduction. Many important tools from theory and applications in systems theory have been widely ignored during the recent decades in the context of PDE constraint optimal control problems and simulation of electric circuits. Often this is due to the fact that large scale matrices are suspected to be...

Applicability of Program Comprehension to Sparse Matrix Computations

Space-efficient data structures for sparse matrices typically yield programs in which not all data dependencies can be determined at compile time. Automatic parallelization of such codes is usually done at run time, e.g. by applying the inspector/executor technique, incurring tremendous overhead. Program comprehension techniques have been shown to improve automatic parallelization of dense ma...


Journal:
  • J. Parallel Distrib. Comput.

Volume 38, Issue -

Pages -

Publication date: 1996